Neural Information Processing Systems

We thank all of the reviewers for their helpful reviews. We agree that our original objective function has no guarantee that the outcome is fair; (1) we changed the objective function so that a specified level of fairness is guaranteed; (2) we performed experiments. See the table below (row 1). See the table below (row 2). See the table (row 3) and the figure below. As suggested, we compared our approaches to Zhang et al., "Mitigating Unwanted Biases with Adversarial Learning", and we show that it does not outperform our methods. Finally, we will improve the visibility of Figure 1 in our paper.


The Disparate Benefits of Deep Ensembles

Schweighofer, Kajetan, Arnaiz-Rodriguez, Adrian, Hochreiter, Sepp, Oliver, Nuria

arXiv.org Artificial Intelligence

Ensembles of Deep Neural Networks, Deep Ensembles, are widely used as a simple way to boost predictive performance. However, their impact on algorithmic fairness is not yet well understood. Algorithmic fairness investigates how a model's performance varies across different groups, typically defined by protected attributes such as age, gender, or race. In this work, we investigate the interplay between the performance gains from Deep Ensembles and fairness. Our analysis reveals that they unevenly favor different groups, in what we refer to as a disparate benefits effect. We empirically investigate this effect with Deep Ensembles applied to popular facial analysis and medical imaging datasets where protected group attributes are given, and find that it occurs for multiple established group fairness metrics, including statistical parity and equal opportunity. Furthermore, we identify the per-group difference in the predictive diversity of ensemble members as the potential cause of the disparate benefits effect. Finally, we evaluate different approaches to reduce the unfairness due to the disparate benefits effect. Our findings show that post-processing is an effective method to mitigate this unfairness while preserving the improved performance of Deep Ensembles.
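The two group fairness metrics named in this abstract can be computed directly from model predictions. The sketch below (plain NumPy on synthetic data with a hypothetical binary protected attribute; not the paper's models, datasets, or code) illustrates how averaging ensemble members can improve accuracy by a different amount for each group — a disparate benefits effect in miniature:

```python
import numpy as np

def statistical_parity_diff(y_pred, group):
    """Difference in positive-prediction rate between groups 0 and 1."""
    return y_pred[group == 0].mean() - y_pred[group == 1].mean()

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rate between groups 0 and 1."""
    tpr0 = y_pred[(group == 0) & (y_true == 1)].mean()
    tpr1 = y_pred[(group == 1) & (y_true == 1)].mean()
    return tpr0 - tpr1

# Toy setup: member models are noisier (more diverse) on group 1.
rng = np.random.default_rng(0)
n = 20_000
group = rng.integers(0, 2, n)        # hypothetical protected attribute
y_true = rng.integers(0, 2, n)       # ground-truth binary labels
noise_scale = np.where(group == 0, 0.2, 0.5)
members = [y_true + rng.normal(0.0, noise_scale) for _ in range(5)]

single_pred = (members[0] > 0.5).astype(int)             # one member
ensemble_pred = (np.mean(members, axis=0) > 0.5).astype(int)  # averaged

def group_accuracy(pred, g):
    return (pred == y_true)[group == g].mean()

gain_g0 = group_accuracy(ensemble_pred, 0) - group_accuracy(single_pred, 0)
gain_g1 = group_accuracy(ensemble_pred, 1) - group_accuracy(single_pred, 1)
print(f"ensemble accuracy gain: group 0 {gain_g0:+.3f}, group 1 {gain_g1:+.3f}")
print(f"SPD (ensemble): {statistical_parity_diff(ensemble_pred, group):+.3f}")
print(f"EOD (ensemble): {equal_opportunity_diff(y_true, ensemble_pred, group):+.3f}")
```

In this toy construction the ensemble's gain is larger for the group whose members are more diverse, echoing the abstract's point that per-group differences in predictive diversity drive the effect.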


Beyond Performance: Quantifying and Mitigating Label Bias in LLMs

Reif, Yuval, Schwartz, Roy

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown remarkable adaptability to diverse tasks by leveraging context prompts containing instructions or minimal input-output examples. However, recent work has revealed that they also exhibit label bias -- an undesirable preference for predicting certain answers over others. Still, detecting and measuring this bias reliably and at scale has remained relatively unexplored. In this study, we evaluate different approaches to quantifying label bias in a model's predictions, conducting a comprehensive investigation across 279 classification tasks and ten LLMs. Our investigation reveals substantial label bias in models both before and after debiasing attempts, and highlights the importance of outcomes-based evaluation metrics, which were not previously used in this regard. We further propose a novel label bias calibration method tailored for few-shot prompting, which outperforms recent calibration approaches at both improving performance and mitigating label bias. Our results emphasize that label bias in the predictions of LLMs remains a barrier to their reliability.
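As a rough illustration of what a label-bias correction can look like, the sketch below applies a generic contextual-calibration-style adjustment: divide out the prior probability the model assigns to each label on a content-free input, then renormalize. This is not the calibration method proposed in the paper, and all probabilities here are made up:

```python
import numpy as np

def calibrate(probs, content_free_probs):
    """Divide each label's probability by the prior the model assigns to
    that label on a content-free prompt, then renormalize per example.
    (A generic sketch, not the paper's proposed calibration method.)"""
    scores = probs / content_free_probs
    return scores / scores.sum(axis=-1, keepdims=True)

# Hypothetical outputs for a binary task: the model prefers label 0.
prior = np.array([0.7, 0.3])           # P(label | content-free prompt)
probs = np.array([[0.60, 0.40],        # biased toward label 0
                  [0.75, 0.25],
                  [0.55, 0.45]])
cal = calibrate(probs, prior)
print("raw predictions:       ", probs.argmax(axis=-1))
print("calibrated predictions:", cal.argmax(axis=-1))
```

Before calibration every example is pulled toward label 0; after dividing out the prior, the first and third examples flip to label 1, while the second stays at label 0.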


Implementation and Evaluation of a System for Assessment of The Quality of Long-Term Management of Patients at a Geriatric Hospital

Shalom, Erez, Goldstein, Ayelet, Wais, Roni, Slivanova, Maya, Cohen, Nogah Melamed, Shahar, Yuval

arXiv.org Artificial Intelligence

Background: The use of a clinical decision support system for assessing the quality of care, based on computerized clinical guidelines (GLs), is likely to improve care, reduce costs, save time, and enhance the staff's capabilities. Objectives: To implement and evaluate a system for assessing the quality of care in the domain of pressure-ulcer management, by investigating the staff's level of compliance with the GLs. Methods: Using data for 100 random patients from the local EMR system, we performed a technical evaluation, checking applicability and usability, followed by a functional evaluation of the system, investigating the quality metrics assigned to the medical staff's compliance with the protocol. We compared the scores given by the nurse when supported by the system to the scores given by the nurse without the system's support, and to the scores given by the system. We also measured the time taken to perform the assessment with and without the system's support. Results: There were no significant differences in the scores of most measures given by the nurse using the system compared to the scores given by the system. There were also no significant differences in the values of most quality measures given by the nurse without support compared to those given by the nurse with support. Using the system, however, significantly reduced the nurse's average assessment time. Conclusions: Using an automated quality-assessment system may enable a senior nurse to quickly and accurately assess the quality of care. In addition to its accuracy, the system considerably reduces the time taken to assess the various quality measures.


Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence

Jorge, Emilio, Kågebäck, Mikael, Johansson, Fredrik D., Gustavsson, Emil

arXiv.org Artificial Intelligence

Acquiring your first language is an incredible feat and not easily duplicated. Learning to communicate using nothing but a few pictureless books, a corpus, would likely be impossible even for humans. Nevertheless, this is the dominant approach in most natural language processing today. As an alternative, we propose the use of situated interactions between agents as a driving force for communication, and the framework of Deep Recurrent Q-Networks for evolving a shared language grounded in the provided environment. We task the agents with interactive image search in the form of the game Guess Who?. The images from the game provide a non-trivial environment for the agents to discuss and a natural grounding for the concepts they decide to encode in their communication. Our experiments show not only that the agents learn to encode physical concepts in their words, i.e. grounding, but also that they learn to hold a multi-step dialogue, remembering the state of the dialogue from step to step.
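The learning dynamics this abstract describes can be miniaturized into a tabular referential game: two independent Q-learners, a sender that observes a "concept" and emits a symbol, and a receiver that sees only the symbol and guesses the concept, both rewarded on a correct guess. This sketch uses plain tabular Q-learning rather than the paper's Deep Recurrent Q-Networks, and all sizes and hyperparameters are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
n_concepts, n_symbols = 4, 4
Q_send = np.zeros((n_concepts, n_symbols))  # sender: concept -> symbol values
Q_recv = np.zeros((n_symbols, n_concepts))  # receiver: symbol -> guess values
eps, lr = 0.1, 0.5                          # exploration rate, learning rate

for step in range(5000):
    c = rng.integers(n_concepts)            # concept shown to the sender
    s = rng.integers(n_symbols) if rng.random() < eps else int(Q_send[c].argmax())
    g = rng.integers(n_concepts) if rng.random() < eps else int(Q_recv[s].argmax())
    r = float(g == c)                       # shared reward for a correct guess
    Q_send[c, s] += lr * (r - Q_send[c, s])
    Q_recv[s, g] += lr * (r - Q_recv[s, g])

# Evaluate the greedy protocol: how often does communication succeed?
success = np.mean([Q_recv[int(Q_send[c].argmax())].argmax() == c
                   for c in range(n_concepts)])
print(f"greedy communication success: {success:.2f}")
```

Reward alone is enough pressure for a concept-to-symbol code to emerge, which is the grounding idea in miniature; running the sketch typically yields a high greedy success rate, though independent tabular learners can get stuck in partial codes. The paper's setting adds images, a multi-step dialogue, and recurrent networks on top of this basic loop.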